print("Welcome back, Let's start the assessment")
Welcome back, Let's start the assessment
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt  # needed by the plotting cells below, which call plt.*
!pip install openpyxl
!pip install seaborn
!pip install matplotlib
file_path = 'D:/PI/sada_assessment_excel.xlsx'
df = pd.read_excel(file_path)
# Display the dataframe (pandas shows the first and last five rows and elides the rest)
df
| | (Parent) ASIN | (Child) ASIN | Title | SKU | Sessions - Total | Sessions - Total - B2B | Session Percentage - Total | Session Percentage - Total - B2B | Page Views - Total | Page Views - Total - B2B | ... | Featured Offer (Buy Box) Percentage | Featured Offer (Buy Box) Percentage - B2B | Units Ordered | Units Ordered - B2B | Unit Session Percentage | Unit Session Percentage - B2B | Ordered Product Sales | Ordered Product Sales - B2B | Total Order Items | Total Order Items - B2B |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | B09M3HDJV5 | B09M3HDJV5 | SAADAA Women Cotton Pant Stretchable Regular S... | SD_CPBE_XL | 50911 | 434 | 0.0388 | 0.0275 | 64844 | 567 | ... | 0.9963 | 0.9935 | 994 | 16 | 0.0195 | 0.0369 | ₹ 7,68,432.13 | ₹ 12,784.00 | 990 | 15 |
| 1 | B09M3GTK48 | B09M3GTK48 | SAADAA Women Cotton Pant Stretchable Regular S... | SD_CPWH_XL | 18738 | 263 | 0.0143 | 0.0167 | 24568 | 341 | ... | 0.9954 | 0.9945 | 923 | 9 | 0.0493 | 0.0342 | ₹ 7,08,748.65 | ₹ 7,147.24 | 918 | 9 |
| 2 | B09M3H45LV | B09M3H45LV | SAADAA Women Cotton Pant Stretchable Regular S... | SD_CPBE_L | 25855 | 342 | 0.0197 | 0.0217 | 35132 | 488 | ... | 0.9950 | 0.9960 | 826 | 12 | 0.0319 | 0.0351 | ₹ 6,35,619.40 | ₹ 9,508.05 | 822 | 12 |
| 3 | B09M3DFC83 | B09M3DFC83 | SAADAA Women Cotton Pant Stretchable Regular S... | SD_CPWH_L | 17877 | 238 | 0.0136 | 0.0151 | 23382 | 330 | ... | 0.9949 | 0.9932 | 745 | 9 | 0.0417 | 0.0378 | ₹ 5,71,969.78 | ₹ 7,151.05 | 742 | 9 |
| 4 | B09M3HMG4T | B09M3HMG4T | SAADAA Everyday Cotton Pants Women Cotton Trou... | SD_CPBL_L | 11689 | 170 | 0.0089 | 0.0108 | 14801 | 211 | ... | 0.9983 | 1.0000 | 709 | 8 | 0.0607 | 0.0471 | ₹ 5,40,178.22 | ₹ 6,392.00 | 700 | 8 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 494 | B0CVB4SBZY | B0CVB4SBZY | SAADAA V-Neck Tops for Women, Plain, Sleeveles... | SDRSTNB_L | 255 | 1 | 0.0002 | 0.0001 | 286 | 2 | ... | 1.0000 | 0.0000 | 7 | 0 | 0.0275 | 0.0000 | ₹ 5,593.00 | ₹ 0.00 | 7 | 0 |
| 495 | B0CVB4ZN2C | B0CVB4ZN2C | SAADAA V-Neck Tops for Women, Plain, Sleeveles... | SDRSTMA_3XL | 154 | 1 | 0.0001 | 0.0001 | 191 | 1 | ... | 0.9944 | 1.0000 | 7 | 0 | 0.0455 | 0.0000 | ₹ 5,593.00 | ₹ 0.00 | 7 | 0 |
| 496 | B0CVB6DYFW | B0CVB6DYFW | SAADAA V-Neck Tops for Women, Plain, Sleeveles... | SDRSTNB_3XL | 171 | 1 | 0.0001 | 0.0001 | 197 | 1 | ... | 0.9946 | 0.0000 | 7 | 0 | 0.0409 | 0.0000 | ₹ 5,593.00 | ₹ 0.00 | 7 | 0 |
| 497 | B0CVB6K8SR | B0CVB6K8SR | SAADAA V-Neck Tops for Women, Plain, Sleeveles... | SDRSTBE_2XL | 484 | 7 | 0.0004 | 0.0004 | 600 | 7 | ... | 0.9964 | 1.0000 | 7 | 0 | 0.0145 | 0.0000 | ₹ 5,589.20 | ₹ 0.00 | 7 | 0 |
| 498 | B0CVB73N6V | B0CVB73N6V | SAADAA V-Neck Tops for Women, Plain, Sleeveles... | SDRSTJA_XL | 270 | 6 | 0.0002 | 0.0004 | 312 | 7 | ... | 1.0000 | 1.0000 | 7 | 0 | 0.0259 | 0.0000 | ₹ 5,593.00 | ₹ 0.00 | 7 | 0 |
499 rows × 22 columns
df.shape
(499, 22)
df.dtypes
(Parent) ASIN                                 object
(Child) ASIN                                  object
Title                                         object
SKU                                           object
Sessions - Total                              int64
Sessions - Total - B2B                        int64
Session Percentage - Total                    float64
Session Percentage - Total - B2B              float64
Page Views - Total                            int64
Page Views - Total - B2B                      int64
Page Views Percentage - Total                 float64
Page Views Percentage - Total - B2B           float64
Featured Offer (Buy Box) Percentage           float64
Featured Offer (Buy Box) Percentage - B2B     float64
Units Ordered                                 int64
Units Ordered - B2B                           int64
Unit Session Percentage                       float64
Unit Session Percentage - B2B                 float64
Ordered Product Sales                         object
Ordered Product Sales - B2B                   object
Total Order Items                             int64
Total Order Items - B2B                       int64
dtype: object
1. Data Gathering:
Review the provided dataset to understand the key variables (e.g., ASIN, Sessions, Units Ordered, Sales, etc.).
## Write about it
1. (Parent) ASIN: The Amazon Standard Identification Number (ASIN) of the parent product, which represents a group of related products (e.g., different sizes, colors).
2. (Child) ASIN: The ASIN for the specific variation of the parent product (e.g., a specific size or color).
3. Title: The title or name of the product being sold.
4. SKU: Stock Keeping Unit, a unique identifier used by the seller to track inventory.
5. Sessions - Total: The number of sessions (visits) to the product page from customers, representing traffic.
6. Sessions - Total - B2B: The number of sessions specifically from B2B (business-to-business) customers.
7. Session Percentage - Total: This product's share of all sessions across the seller's products.
8. Session Percentage - Total - B2B: This product's share of all B2B sessions across the seller's products (not the B2B share of the product's own sessions; in the first row, 434 / 50,911 is about 0.0085, while the reported value is 0.0275).
9. Page Views - Total: The total number of page views for the product. It includes multiple views by the same customer in one session.
10. Page Views - Total - B2B: The total number of page views from B2B customers.
11. Page Views Percentage - Total: This product's share of all page views across the seller's products.
12. Page Views Percentage - Total - B2B: This product's share of all B2B page views across the seller's products.
13. Featured Offer (Buy Box) Percentage: The percentage of page views in which the product held the Buy Box, the section on an Amazon product page where customers can add the item to their cart or buy directly.
14. Featured Offer (Buy Box) Percentage - B2B: The same Buy Box percentage, measured for B2B customers.
15. Units Ordered: The total number of units ordered for the product.
16. Units Ordered - B2B: The number of units ordered by B2B customers.
17. Unit Session Percentage: Units ordered divided by sessions, commonly read as the conversion rate.
18. Unit Session Percentage - B2B: The same ratio computed for B2B units and B2B sessions.
19. Ordered Product Sales: The total sales value (in ₹) from the ordered units.
20. Ordered Product Sales - B2B: The total sales value (in ₹) from units ordered by B2B customers.
21. Total Order Items: The total number of ordered line items (this can be lower than units ordered when a customer buys several units of the product in one order).
22. Total Order Items - B2B: The total number of ordered line items from B2B customers.
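As a quick check on the ratio definitions above, the arithmetic can be reproduced by hand. The sketch below uses the figures for SD_CPBE_XL from the first row of the dataframe; the variable names are illustrative only and are not part of the dataset.

```python
# Figures for SKU SD_CPBE_XL, copied from the first row of the dataframe above.
sessions = 50911
page_views = 64844
units_ordered = 994
reported_unit_session_pct = 0.0195  # value shown in the "Unit Session Percentage" column

# Unit Session Percentage = units ordered / sessions
unit_session_pct = units_ordered / sessions

# Page views per session (one session can include several views of the page)
views_per_session = page_views / sessions

print(f"Unit Session Percentage: {unit_session_pct:.4f} (reported: {reported_unit_session_pct})")
print(f"Page views per session:  {views_per_session:.2f}")
```

The computed ratio matches the reported column to four decimal places, which supports reading Unit Session Percentage as units per session.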
df['Ordered Product Sales']= df['Ordered Product Sales'].replace({'₹':'',',':''},regex=True)
df['Ordered Product Sales']= df['Ordered Product Sales'].astype('float')
df['Ordered Product Sales - B2B']= df['Ordered Product Sales - B2B'].replace({'₹':'',',':''},regex=True)
df['Ordered Product Sales - B2B']= df['Ordered Product Sales - B2B'].astype('float')
# Ordered Product Sales and Ordered Product Sales - B2B were stored as object (text) because of the ₹ sign and commas; they are monetary amounts, so they are converted to float here
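The same cleaning step can be sketched a little more defensively. The example below is only an illustration on three sample strings taken from the sales columns shown earlier: it strips the rupee sign, commas, and whitespace in a single character class, and uses `pd.to_numeric` with `errors="coerce"` so that any unexpected value becomes NaN instead of raising. Whether the real file contains such values is not known.

```python
import pandas as pd

# Sample values copied from the "Ordered Product Sales" columns shown earlier.
raw = pd.Series(["₹ 7,68,432.13", "₹ 12,784.00", "₹ 0.00"])

# Remove the rupee sign, thousands separators and whitespace in one pass,
# then convert; errors="coerce" turns anything unparseable into NaN.
cleaned = pd.to_numeric(raw.str.replace(r"[₹,\s]", "", regex=True), errors="coerce")

print(cleaned.tolist())
print("unparseable values:", int(cleaned.isna().sum()))
```

Counting the NaN values afterwards gives a cheap signal that the conversion did not silently drop anything.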
df.dtypes
(Parent) ASIN                                 object
(Child) ASIN                                  object
Title                                         object
SKU                                           object
Sessions - Total                              int64
Sessions - Total - B2B                        int64
Session Percentage - Total                    float64
Session Percentage - Total - B2B              float64
Page Views - Total                            int64
Page Views - Total - B2B                      int64
Page Views Percentage - Total                 float64
Page Views Percentage - Total - B2B           float64
Featured Offer (Buy Box) Percentage           float64
Featured Offer (Buy Box) Percentage - B2B     float64
Units Ordered                                 int64
Units Ordered - B2B                           int64
Unit Session Percentage                       float64
Unit Session Percentage - B2B                 float64
Ordered Product Sales                         float64
Ordered Product Sales - B2B                   float64
Total Order Items                             int64
Total Order Items - B2B                       int64
dtype: object
2. Data Analysis:
Question 1: Identify top-performing products based on Units Ordered and Ordered Product Sales.
pivot_table = pd.pivot_table(df,
values=['Units Ordered', 'Ordered Product Sales'],
index='SKU',
aggfunc='sum')
pivot_table_sorted= pivot_table.sort_values(by='Units Ordered', ascending=False)
pivot_table_sorted.head(10)
| SKU | Ordered Product Sales | Units Ordered |
|---|---|---|
| SD_CPBE_XL | 768432.13 | 994 |
| SD_CPWH_XL | 708748.65 | 923 |
| SD_CPBE_L | 635619.40 | 826 |
| SD_CPWH_L | 571969.78 | 745 |
| SD_CPBL_L | 540178.22 | 709 |
| SD_CPBL_XL | 487344.31 | 616 |
| SD_CPBL_XXL | 468696.58 | 615 |
| SD_CPWH_XXL | 438903.69 | 587 |
| SD_CPBE_M | 370750.62 | 483 |
| SD_CPWH_XXXL | 356232.23 | 457 |
The table lists the top 10 products by units ordered. SD_CPBE_XL (SAADAA Women Cotton Pant Stretchable Regular Slim Fit Trouser, Beige, XL) ranks first with 994 units ordered and ₹7,68,432.13 in sales, followed by SD_CPWH_XL with 923 units. All ten SKUs share the SD_CP prefix, which highlights the strong demand for this product line compared to the rest of the catalogue.
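The question asks for a ranking on both Units Ordered and Ordered Product Sales, while the table is sorted on units only. A minimal sketch, using just the ten rows from the table above, checks whether the two rankings agree and derives the average revenue per unit; the tuple layout and variable names are assumptions made for this illustration.

```python
# (SKU, units ordered, ordered product sales in ₹), copied from the top-10 table above.
top10 = [
    ("SD_CPBE_XL", 994, 768432.13),
    ("SD_CPWH_XL", 923, 708748.65),
    ("SD_CPBE_L", 826, 635619.40),
    ("SD_CPWH_L", 745, 571969.78),
    ("SD_CPBL_L", 709, 540178.22),
    ("SD_CPBL_XL", 616, 487344.31),
    ("SD_CPBL_XXL", 615, 468696.58),
    ("SD_CPWH_XXL", 587, 438903.69),
    ("SD_CPBE_M", 483, 370750.62),
    ("SD_CPWH_XXXL", 457, 356232.23),
]

# Rank the same rows once by units and once by sales value.
by_units = [sku for sku, units, sales in sorted(top10, key=lambda row: row[1], reverse=True)]
by_sales = [sku for sku, units, sales in sorted(top10, key=lambda row: row[2], reverse=True)]
print("Same ranking by units and by sales:", by_units == by_sales)

# Average revenue per unit for each SKU.
revenue_per_unit = {sku: sales / units for sku, units, sales in top10}
for sku, value in revenue_per_unit.items():
    print(f"{sku:14s} ₹{value:,.2f} per unit")
```

For these ten rows the two rankings coincide, which is consistent with the near-identical unit prices across the variants.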
Question 2: Identify trends in session performance (e.g., total sessions vs. B2B sessions).
# Pass the string 'sum' rather than the built-in sum to avoid the pandas FutureWarning
pivot_df = pd.pivot_table(df, values=['Sessions - Total', 'Sessions - Total - B2B'], index=['SKU'], aggfunc='sum')
pivot_df_sort= pivot_df.sort_values(by='Sessions - Total', ascending=False)
pivot_df_sort.head(10)
| SKU | Sessions - Total | Sessions - Total - B2B |
|---|---|---|
| SD_CPBE_XL | 50911 | 434 |
| SD_CPBL_XL | 28416 | 314 |
| SD_CPGR_4XL | 26078 | 299 |
| SD_CPBE_L | 25855 | 342 |
| SD_CPBE_M | 23364 | 259 |
| SD_CPWH_XL | 18738 | 263 |
| SD_CPWH_L | 17877 | 238 |
| SD_CPBE_S | 15799 | 177 |
| SD_CPWH_XXXL | 15648 | 131 |
| SDFSKEC_L | 14214 | 150 |
From the session analysis, SD_CPBE_XL (Beige, XL) leads with 50,911 total sessions and 434 B2B sessions, indicating significant customer
interest and engagement. The next highest are SD_CPBL_XL (Black, XL) with 28,416 total sessions and 314 B2B sessions, followed by SD_CPGR_4XL
(Grey, 4XL) with 26,078 total sessions and 299 B2B sessions. B2B sessions make up only about 1% of total sessions for each of these SKUs, so the
traffic is overwhelmingly B2C, and SD_CPBE_XL clearly outperforms the others in attracting it.
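To compare total and B2B sessions on the same scale, it can help to express B2B sessions as a share of total sessions. The sketch below rebuilds the ten rows from the table above by hand; the `B2B share (%)` column name is an assumption introduced for this example.

```python
import pandas as pd

# Rows copied from the top-10 sessions table above.
sessions = pd.DataFrame(
    {
        "Sessions - Total": [50911, 28416, 26078, 25855, 23364, 18738, 17877, 15799, 15648, 14214],
        "Sessions - Total - B2B": [434, 314, 299, 342, 259, 263, 238, 177, 131, 150],
    },
    index=pd.Index(
        ["SD_CPBE_XL", "SD_CPBL_XL", "SD_CPGR_4XL", "SD_CPBE_L", "SD_CPBE_M",
         "SD_CPWH_XL", "SD_CPWH_L", "SD_CPBE_S", "SD_CPWH_XXXL", "SDFSKEC_L"],
        name="SKU",
    ),
)

# B2B sessions as a percentage of all sessions for the same SKU.
sessions["B2B share (%)"] = (
    100 * sessions["Sessions - Total - B2B"] / sessions["Sessions - Total"]
).round(2)

print(sessions.sort_values("B2B share (%)", ascending=False))
```

In the notebook itself, the same column could be added to `pivot_df_sort` instead of a hand-built frame.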
Question 3: Provide insights on how customer engagement (e.g., sessions, page views) correlates with units ordered and sales.
# Calculate correlations
correlation_matrix = df[['Sessions - Total', 'Page Views - Total', 'Units Ordered', 'Ordered Product Sales']].corr()
print("Correlation Matrix:")
print(correlation_matrix)
# Visualize correlation with scatter plots and regression line
plt.figure(figsize=(12, 8))
# Plot for Sessions vs Units Ordered
plt.subplot(2, 2, 1)
sns.regplot(x='Sessions - Total', y='Units Ordered', data=df)
plt.title('Sessions vs Units Ordered')
# Plot for Sessions vs Ordered Product Sales
plt.subplot(2, 2, 2)
sns.regplot(x='Sessions - Total', y='Ordered Product Sales', data=df)
plt.title('Sessions vs Ordered Product Sales')
# Plot for Page Views vs Units Ordered
plt.subplot(2, 2, 3)
sns.regplot(x='Page Views - Total', y='Units Ordered', data=df)
plt.title('Page Views vs Units Ordered')
# Plot for Page Views vs Ordered Product Sales
plt.subplot(2, 2, 4)
sns.regplot(x='Page Views - Total', y='Ordered Product Sales', data=df)
plt.title('Page Views vs Ordered Product Sales')
plt.tight_layout()
plt.show()
Correlation Matrix:
| | Sessions - Total | Page Views - Total | Units Ordered | Ordered Product Sales |
|---|---|---|---|---|
| Sessions - Total | 1.000000 | 0.999028 | 0.844451 | 0.846511 |
| Page Views - Total | 0.999028 | 1.000000 | 0.843751 | 0.846120 |
| Units Ordered | 0.844451 | 0.843751 | 1.000000 | 0.998619 |
| Ordered Product Sales | 0.846511 | 0.846120 | 0.998619 | 1.000000 |
1. Sessions - Total and Page Views - Total show a near-perfect correlation of 0.999, indicating that as sessions increase, page views rise almost
proportionally.
2. Units Ordered has a strong positive correlation with Sessions - Total (0.844) and Page Views - Total (0.844): products with more traffic tend to
have more orders. Correlation alone does not show that traffic causes the orders.
3. The correlation between Units Ordered and Ordered Product Sales is exceptionally high at 0.999, confirming that as units ordered increase, product sales
increase almost in perfect sync, as expected when unit prices are similar across products.
4. Sessions - Total and Ordered Product Sales also have a strong correlation of 0.847, showing that products with more sessions tend to have higher sales,
which supports the importance of driving traffic to product pages.
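Pearson correlation, which `DataFrame.corr()` computes by default, can be dominated by a few very large values such as SD_CPBE_XL. As a hedged illustration of what the coefficient measures and how a rank-based (Spearman) check differs, the sketch below implements both from scratch on the first five rows shown earlier. The helper functions `pearson` and `ranks` are written for this example and are not part of the notebook; five rows are far too few to draw conclusions from.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

# First five rows of the dataframe shown earlier (not the full dataset).
sessions = [50911, 18738, 25855, 17877, 11689]
units = [994, 923, 826, 745, 709]

r = pearson(sessions, units)
rho = pearson(ranks(sessions), ranks(units))  # Spearman = Pearson applied to ranks
print(f"Pearson r: {r:.3f}, Spearman rho: {rho:.3f}")
```

On the full data, a comparable rank-based check could be obtained with `df[cols].rank().corr()`, where `cols` is the list of four columns used above.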
3. Data Visualization:
Question: Provide at least 3 key visualizations that could be used to present the data to stakeholders. Examples include:
1. A bar graph comparing units ordered across different products.
2. A line graph showing session trends over time.
3. A scatter plot showing the relationship between customer sessions and ordered product sales.
# 1. Bar Graph: Comparing Units Ordered Across Products
plt.figure(figsize=(50, 50))
# Assign the x variable to hue and hide the legend, as seaborn 0.13 asks when a palette is passed
sns.barplot(x='SKU', y='Units Ordered', hue='SKU', data=df, palette='Blues_d', legend=False)
plt.title('Units Ordered Across Different Products')
plt.xlabel('SKU')
plt.ylabel('Units Ordered')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
### Every product (SKU) is unique in the given data, so to compare units ordered across products we can take the raw rows and sort them by Units Ordered in descending order.
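With 499 unique SKUs on one axis the bar chart is hard to read, so one option is to keep only the best sellers before plotting. The sketch below shows the selection step on a handful of rows copied from the dataframe shown earlier; `top_n` is an assumed cut-off, and in the notebook the call would be made on `df` itself.

```python
import pandas as pd

# A few rows copied from the dataframe shown earlier; in the notebook, use df instead.
sample = pd.DataFrame(
    {
        "SKU": ["SDRSTNB_L", "SD_CPWH_XL", "SD_CPBE_XL", "SDRSTJA_XL",
                "SD_CPBE_L", "SD_CPWH_L", "SD_CPBL_L"],
        "Units Ordered": [7, 923, 994, 7, 826, 745, 709],
    }
)

top_n = 5  # assumed cut-off; adjust to what fits on the chart
top_products = sample.nlargest(top_n, "Units Ordered")
print(top_products.to_string(index=False))

# In the notebook, the filtered frame can then be plotted, for example:
# sns.barplot(x="SKU", y="Units Ordered", data=top_products)
```

`nlargest` returns the rows already sorted in descending order, so the bars appear from best seller downward.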
# 2. Line Graph: Session Trends Over Products (the data has no date column, so SKU stands in for time on the x-axis)
plt.figure(figsize=(10, 6))
sns.lineplot(x='SKU', y='Sessions - Total', data=df, marker='o', color='green')
plt.title('Session Trends Over Products')
plt.xlabel('SKU')
plt.ylabel('Sessions - Total')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
# 3. Scatter Plot: Relationship Between Sessions and Ordered Product Sales
plt.figure(figsize=(10, 6))
# legend=False: a legend with one entry per SKU (499 entries) does not fit and breaks tight_layout
sns.scatterplot(x='Sessions - Total', y='Ordered Product Sales', data=df, hue='SKU', palette='viridis', s=100, legend=False)
plt.title('Relationship Between Sessions and Ordered Product Sales')
plt.xlabel('Sessions - Total')
plt.ylabel('Ordered Product Sales')
plt.tight_layout()
plt.show()
4. Insights and Recommendations: Summarize your findings based on the data and visualizations.
## Measure of Central Tendency
df_numeric= df.select_dtypes(include='number')  # selects every int and float column regardless of bit width
df_numeric
| | Sessions - Total | Sessions - Total - B2B | Session Percentage - Total | Session Percentage - Total - B2B | Page Views - Total | Page Views - Total - B2B | Page Views Percentage - Total | Page Views Percentage - Total - B2B | Featured Offer (Buy Box) Percentage | Featured Offer (Buy Box) Percentage - B2B | Units Ordered | Units Ordered - B2B | Unit Session Percentage | Unit Session Percentage - B2B | Ordered Product Sales | Ordered Product Sales - B2B | Total Order Items | Total Order Items - B2B |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 50911 | 434 | 0.0388 | 0.0275 | 64844 | 567 | 0.0401 | 0.0292 | 0.9963 | 0.9935 | 994 | 16 | 0.0195 | 0.0369 | 768432.13 | 12784.00 | 990 | 15 |
| 1 | 18738 | 263 | 0.0143 | 0.0167 | 24568 | 341 | 0.0152 | 0.0176 | 0.9954 | 0.9945 | 923 | 9 | 0.0493 | 0.0342 | 708748.65 | 7147.24 | 918 | 9 |
| 2 | 25855 | 342 | 0.0197 | 0.0217 | 35132 | 488 | 0.0217 | 0.0252 | 0.9950 | 0.9960 | 826 | 12 | 0.0319 | 0.0351 | 635619.40 | 9508.05 | 822 | 12 |
| 3 | 17877 | 238 | 0.0136 | 0.0151 | 23382 | 330 | 0.0145 | 0.0170 | 0.9949 | 0.9932 | 745 | 9 | 0.0417 | 0.0378 | 571969.78 | 7151.05 | 742 | 9 |
| 4 | 11689 | 170 | 0.0089 | 0.0108 | 14801 | 211 | 0.0092 | 0.0109 | 0.9983 | 1.0000 | 709 | 8 | 0.0607 | 0.0471 | 540178.22 | 6392.00 | 700 | 8 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 494 | 255 | 1 | 0.0002 | 0.0001 | 286 | 2 | 0.0002 | 0.0001 | 1.0000 | 0.0000 | 7 | 0 | 0.0275 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 495 | 154 | 1 | 0.0001 | 0.0001 | 191 | 1 | 0.0001 | 0.0001 | 0.9944 | 1.0000 | 7 | 0 | 0.0455 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 496 | 171 | 1 | 0.0001 | 0.0001 | 197 | 1 | 0.0001 | 0.0001 | 0.9946 | 0.0000 | 7 | 0 | 0.0409 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 497 | 484 | 7 | 0.0004 | 0.0004 | 600 | 7 | 0.0004 | 0.0004 | 0.9964 | 1.0000 | 7 | 0 | 0.0145 | 0.0000 | 5589.20 | 0.00 | 7 | 0 |
| 498 | 270 | 6 | 0.0002 | 0.0004 | 312 | 7 | 0.0002 | 0.0004 | 1.0000 | 1.0000 | 7 | 0 | 0.0259 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
499 rows × 18 columns
df_numeric.mean()
Sessions - Total                              2300.236473
Sessions - Total - B2B                        27.280561
Session Percentage - Total                    0.001750
Session Percentage - Total - B2B              0.001730
Page Views - Total                            2837.214429
Page Views - Total - B2B                      33.651303
Page Views Percentage - Total                 0.001756
Page Views Percentage - Total - B2B           0.001741
Featured Offer (Buy Box) Percentage           0.995885
Featured Offer (Buy Box) Percentage - B2B     0.966178
Units Ordered                                 52.773547
Units Ordered - B2B                           0.549098
Unit Session Percentage                       0.022916
Unit Session Percentage - B2B                 0.016596
Ordered Product Sales                         42813.839920
Ordered Product Sales - B2B                   457.116473
Total Order Items                             52.573146
Total Order Items - B2B                       0.547094
dtype: float64
1. The mean of Sessions - Total is 2,300.24 per product, while the mean of Sessions - Total - B2B is 27.28. Session Percentage - Total and Session Percentage - Total - B2B have nearly identical means (0.00175 and 0.00173).
2. Average Ordered Product Sales per product is ₹42,813.84, and average Ordered Product Sales - B2B is ₹457.12.
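A mean can be pulled upward by a few very high-traffic products, so comparing it with the median is a quick sanity check. The sketch below does this for the ten Sessions - Total values visible in the dataframe preview (the first five and last five rows only), so the numbers are purely illustrative and do not describe the full dataset.

```python
from statistics import mean, median

# "Sessions - Total" for the ten rows displayed earlier (first five and last five rows only).
sessions = [50911, 18738, 25855, 17877, 11689, 255, 154, 171, 484, 270]

avg = mean(sessions)
mid = median(sessions)

# A mean well above the median points to a right-skewed distribution.
print(f"mean = {avg:.1f}, median = {mid:.1f}, mean/median = {avg / mid:.2f}")
```

In the notebook, `df_numeric.median()` placed next to `df_numeric.mean()` would give the same comparison for every column.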
df_cat= df.select_dtypes(include=object)
df_cat.describe()
| | (Parent) ASIN | (Child) ASIN | Title | SKU |
|---|---|---|---|---|
| count | 499 | 499 | 499 | 499 |
| unique | 499 | 499 | 499 | 499 |
| top | B09M3HDJV5 | B09M3HDJV5 | SAADAA Women Cotton Pant Stretchable Regular S... | SD_CPBE_XL |
| freq | 1 | 1 | 1 | 1 |
The dataset contains 499 unique products, with no duplicates across Parent ASINs, Child ASINs, Titles, or SKUs. `describe()` reports SD_CPBE_XL (SAADAA Women Cotton Pant Stretchable Regular Slim Fit Trouser, Beige, XL) as `top`, but with a frequency of 1 this is simply the first value encountered, not a genuinely most frequent product. Since every value in these categorical columns is unique, they act as identifiers rather than grouping variables, and there is no variability or repetition in them to draw meaningful insights from.
sns.heatmap(df_numeric.isnull(), cbar=False, cmap="viridis")
The dataset shows no missing values, which indicates that the data is complete and does not require any imputation. This ensures that we can proceed with analysis without concerns about missing data impacting the results.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

def automated_descriptive_analysis(df_num):
    """
    Automates the descriptive analysis of a given DataFrame.
    Includes summary statistics, missing values, correlation matrix, and visualizations.
    """
    print("Basic Information:")
    # info() prints its report directly and returns None, so "None" is printed after it
    print(df_num.info())
    print("\n")

    print("Summary Statistics:")
    print(df_num.describe(include='all'))  # Summary statistics for numerical and categorical variables
    print("\n")

    print("Missing Values:")
    print(df_num.isnull().sum())  # Number of missing values per column
    print("\n")

    print("Correlation Matrix (for numeric columns):")
    corr_matrix = df_num.corr(numeric_only=True)  # Skip non-numeric columns instead of raising
    print(corr_matrix)
    print("\n")

    # Visualization of correlation matrix using a heatmap
    plt.figure(figsize=(10, 8))
    sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt=".2f")
    plt.title("Correlation Heatmap")
    plt.show()

    # Visualize missing values (heatmap)
    plt.figure(figsize=(10, 6))
    sns.heatmap(df_num.isnull(), cbar=False, cmap="viridis")
    plt.title("Missing Values Heatmap")
    plt.show()

    # Univariate distribution for numerical columns
    num_cols = df_num.select_dtypes(include=[np.number]).columns
    for col in num_cols:
        plt.figure(figsize=(6, 4))
        sns.histplot(df_num[col], kde=True)  # Use the function argument, not the global df
        plt.title(f'Distribution of {col}')
        plt.xlabel(col)
        plt.ylabel('Frequency')
        plt.show()

    # Categorical column analysis
    cat_cols = df_num.select_dtypes(include=[object]).columns
    for col in cat_cols:
        plt.figure(figsize=(6, 4))
        sns.countplot(data=df_num, x=col)
        plt.title(f'Count Plot of {col}')
        plt.xlabel(col)
        plt.ylabel('Count')
        plt.xticks(rotation=45)
        plt.show()

    # Pairplot for numeric columns (only drawn when there are at most 5 of them)
    if len(num_cols) <= 5:
        grid = sns.pairplot(df_num[num_cols])
        grid.figure.suptitle("Pairplot for Numeric Columns", y=1.02)  # Title for the whole grid
        plt.show()

# Sample usage
# Load your data (replace with your dataset path or DataFrame)
df_num = df_numeric
# Call the function to perform automated descriptive analysis
automated_descriptive_analysis(df_num)
Basic Information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 499 entries, 0 to 498
Data columns (total 18 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Sessions - Total 499 non-null int64
1 Sessions - Total - B2B 499 non-null int64
2 Session Percentage - Total 499 non-null float64
3 Session Percentage - Total - B2B 499 non-null float64
4 Page Views - Total 499 non-null int64
5 Page Views - Total - B2B 499 non-null int64
6 Page Views Percentage - Total 499 non-null float64
7 Page Views Percentage - Total - B2B 499 non-null float64
8 Featured Offer (Buy Box) Percentage 499 non-null float64
9 Featured Offer (Buy Box) Percentage - B2B 499 non-null float64
10 Units Ordered 499 non-null int64
11 Units Ordered - B2B 499 non-null int64
12 Unit Session Percentage 499 non-null float64
13 Unit Session Percentage - B2B 499 non-null float64
14 Ordered Product Sales 499 non-null float64
15 Ordered Product Sales - B2B 499 non-null float64
16 Total Order Items 499 non-null int64
17 Total Order Items - B2B 499 non-null int64
dtypes: float64(10), int64(8)
memory usage: 70.3 KB
None
Summary Statistics:
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| Sessions - Total | 499.000000 | 2300.236473 | 3988.527400 | 140.000000 | 612.000000 | 1099.000000 | 2310.500000 | 50911.000000 |
| Sessions - Total - B2B | 499.000000 | 27.280561 | 43.852681 | 0.000000 | 7.000000 | 15.000000 | 28.000000 | 434.000000 |
| Session Percentage - Total | 499.000000 | 0.001750 | 0.003038 | 0.000100 | 0.000500 | 0.000800 | 0.001800 | 0.038800 |
| Session Percentage - Total - B2B | 499.000000 | 0.001730 | 0.002781 | 0.000000 | 0.000400 | 0.001000 | 0.001800 | 0.027500 |
| Page Views - Total | 499.000000 | 2837.214429 | 5141.492476 | 174.000000 | 724.500000 | 1326.000000 | 2780.500000 | 64844.000000 |
| Page Views - Total - B2B | 499.000000 | 33.651303 | 57.429323 | 0.000000 | 8.000000 | 17.000000 | 34.000000 | 567.000000 |
| Page Views Percentage - Total | 499.000000 | 0.001756 | 0.003183 | 0.000100 | 0.000400 | 0.000800 | 0.001700 | 0.040100 |
| Page Views Percentage - Total - B2B | 499.000000 | 0.001741 | 0.002958 | 0.000000 | 0.000400 | 0.000900 | 0.001800 | 0.029200 |
| Featured Offer (Buy Box) Percentage | 499.000000 | 0.995885 | 0.008037 | 0.834200 | 0.995100 | 0.996800 | 0.998150 | 1.000000 |
| Featured Offer (Buy Box) Percentage - B2B | 499.000000 | 0.966178 | 0.166665 | 0.000000 | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
| Units Ordered | 499.000000 | 52.773547 | 110.745062 | 7.000000 | 11.000000 | 18.000000 | 43.000000 | 994.000000 |
| Units Ordered - B2B | 499.000000 | 0.549098 | 1.446188 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 16.000000 |
| Unit Session Percentage | 499.000000 | 0.022916 | 0.012560 | 0.001600 | 0.013550 | 0.020500 | 0.029450 | 0.075400 |
| Unit Session Percentage - B2B | 499.000000 | 0.016596 | 0.038312 | 0.000000 | 0.000000 | 0.000000 | 0.015500 | 0.333300 |
| Ordered Product Sales | 499.000000 | 42813.839920 | 84543.946631 | 4794.000000 | 10791.000000 | 17991.000000 | 35102.690000 | 768432.130000 |
| Ordered Product Sales - B2B | 499.000000 | 457.116473 | 1164.402321 | 0.000000 | 0.000000 | 0.000000 | 799.000000 | 12784.000000 |
| Total Order Items | 499.000000 | 52.573146 | 110.181450 | 7.000000 | 11.000000 | 18.000000 | 43.000000 | 990.000000 |
| Total Order Items - B2B | 499.000000 | 0.547094 | 1.425276 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 15.000000 |
Missing Values:
Sessions - Total 0
Sessions - Total - B2B 0
Session Percentage - Total 0
Session Percentage - Total - B2B 0
Page Views - Total 0
Page Views - Total - B2B 0
Page Views Percentage - Total 0
Page Views Percentage - Total - B2B 0
Featured Offer (Buy Box) Percentage 0
Featured Offer (Buy Box) Percentage - B2B 0
Units Ordered 0
Units Ordered - B2B 0
Unit Session Percentage 0
Unit Session Percentage - B2B 0
Ordered Product Sales 0
Ordered Product Sales - B2B 0
Total Order Items 0
Total Order Items - B2B 0
dtype: int64
Correlation Matrix (for numeric columns):
Sessions - Total \
Sessions - Total 1.000000
Sessions - Total - B2B 0.968076
Session Percentage - Total 0.999952
Session Percentage - Total - B2B 0.967796
Page Views - Total 0.999028
Page Views - Total - B2B 0.963378
Page Views Percentage - Total 0.998989
Page Views Percentage - Total - B2B 0.963281
Featured Offer (Buy Box) Percentage 0.002252
Featured Offer (Buy Box) Percentage - B2B 0.085781
Units Ordered 0.844451
Units Ordered - B2B 0.726260
Unit Session Percentage 0.001252
Unit Session Percentage - B2B 0.037124
Ordered Product Sales 0.846511
Ordered Product Sales - B2B 0.715047
Total Order Items 0.844701
Total Order Items - B2B 0.719745
Sessions - Total - B2B \
Sessions - Total 0.968076
Sessions - Total - B2B 1.000000
Session Percentage - Total 0.968039
Session Percentage - Total - B2B 0.999949
Page Views - Total 0.970659
Page Views - Total - B2B 0.996586
Page Views Percentage - Total 0.970751
Page Views Percentage - Total - B2B 0.996498
Featured Offer (Buy Box) Percentage 0.005413
Featured Offer (Buy Box) Percentage - B2B 0.098951
Units Ordered 0.884008
Units Ordered - B2B 0.761843
Unit Session Percentage 0.049353
Unit Session Percentage - B2B 0.057484
Ordered Product Sales 0.884140
Ordered Product Sales - B2B 0.750091
Total Order Items 0.884252
Total Order Items - B2B 0.759954
Session Percentage - Total \
Sessions - Total 0.999952
Sessions - Total - B2B 0.968039
Session Percentage - Total 1.000000
Session Percentage - Total - B2B 0.967755
Page Views - Total 0.998980
Page Views - Total - B2B 0.963328
Page Views Percentage - Total 0.998943
Page Views Percentage - Total - B2B 0.963233
Featured Offer (Buy Box) Percentage 0.001914
Featured Offer (Buy Box) Percentage - B2B 0.085863
Units Ordered 0.844426
Units Ordered - B2B 0.726482
Unit Session Percentage 0.001548
Unit Session Percentage - B2B 0.037172
Ordered Product Sales 0.846430
Ordered Product Sales - B2B 0.715237
Total Order Items 0.844677
Total Order Items - B2B 0.719959
Session Percentage - Total - B2B \
Sessions - Total 0.967796
Sessions - Total - B2B 0.999949
Session Percentage - Total 0.967755
Session Percentage - Total - B2B 1.000000
Page Views - Total 0.970415
Page Views - Total - B2B 0.996565
Page Views Percentage - Total 0.970507
Page Views Percentage - Total - B2B 0.996484
Featured Offer (Buy Box) Percentage 0.005648
Featured Offer (Buy Box) Percentage - B2B 0.098876
Units Ordered 0.883984
Units Ordered - B2B 0.761985
Unit Session Percentage 0.049823
Unit Session Percentage - B2B 0.057602
Ordered Product Sales 0.884149
Ordered Product Sales - B2B 0.750264
Total Order Items 0.884226
Total Order Items - B2B 0.760112
Page Views - Total \
Sessions - Total 0.999028
Sessions - Total - B2B 0.970659
Session Percentage - Total 0.998980
Session Percentage - Total - B2B 0.970415
Page Views - Total 1.000000
Page Views - Total - B2B 0.968083
Page Views Percentage - Total 0.999955
Page Views Percentage - Total - B2B 0.968034
Featured Offer (Buy Box) Percentage -0.000902
Featured Offer (Buy Box) Percentage - B2B 0.082218
Units Ordered 0.843751
Units Ordered - B2B 0.724666
Unit Session Percentage 0.003633
Unit Session Percentage - B2B 0.035363
Ordered Product Sales 0.846120
Ordered Product Sales - B2B 0.713661
Total Order Items 0.843984
Total Order Items - B2B 0.718307
Page Views - Total - B2B \
Sessions - Total 0.963378
Sessions - Total - B2B 0.996586
Session Percentage - Total 0.963328
Session Percentage - Total - B2B 0.996565
Page Views - Total 0.968083
Page Views - Total - B2B 1.000000
Page Views Percentage - Total 0.968117
Page Views Percentage - Total - B2B 0.999953
Featured Offer (Buy Box) Percentage 0.003566
Featured Offer (Buy Box) Percentage - B2B 0.094035
Units Ordered 0.880059
Units Ordered - B2B 0.767457
Unit Session Percentage 0.052661
Unit Session Percentage - B2B 0.060744
Ordered Product Sales 0.880517
Ordered Product Sales - B2B 0.755919
Total Order Items 0.880295
Total Order Items - B2B 0.765633
Page Views Percentage - Total \
Sessions - Total 0.998989
Sessions - Total - B2B 0.970751
Session Percentage - Total 0.998943
Session Percentage - Total - B2B 0.970507
Page Views - Total 0.999955
Page Views - Total - B2B 0.968117
Page Views Percentage - Total 1.000000
Page Views Percentage - Total - B2B 0.968063
Featured Offer (Buy Box) Percentage -0.001365
Featured Offer (Buy Box) Percentage - B2B 0.082279
Units Ordered 0.843881
Units Ordered - B2B 0.724250
Unit Session Percentage 0.004026
Unit Session Percentage - B2B 0.034963
Ordered Product Sales 0.846255
Ordered Product Sales - B2B 0.713248
Total Order Items 0.844113
Total Order Items - B2B 0.717904
Page Views Percentage - Total - B2B \
Sessions - Total 0.963281
Sessions - Total - B2B 0.996498
Session Percentage - Total 0.963233
Session Percentage - Total - B2B 0.996484
Page Views - Total 0.968034
Page Views - Total - B2B 0.999953
Page Views Percentage - Total 0.968063
Page Views Percentage - Total - B2B 1.000000
Featured Offer (Buy Box) Percentage 0.002729
Featured Offer (Buy Box) Percentage - B2B 0.093824
Units Ordered 0.880385
Units Ordered - B2B 0.767552
Unit Session Percentage 0.053814
Unit Session Percentage - B2B 0.060388
Ordered Product Sales 0.880843
Ordered Product Sales - B2B 0.755989
Total Order Items 0.880622
Total Order Items - B2B 0.765736
Featured Offer (Buy Box) Percentage \
Sessions - Total 0.002252
Sessions - Total - B2B 0.005413
Session Percentage - Total 0.001914
Session Percentage - Total - B2B 0.005648
Page Views - Total -0.000902
Page Views - Total - B2B 0.003566
Page Views Percentage - Total -0.001365
Page Views Percentage - Total - B2B 0.002729
Featured Offer (Buy Box) Percentage 1.000000
Featured Offer (Buy Box) Percentage - B2B 0.032717
Units Ordered 0.003260
Units Ordered - B2B 0.010379
Unit Session Percentage -0.003832
Unit Session Percentage - B2B -0.007864
Ordered Product Sales 0.003231
Ordered Product Sales - B2B 0.009369
Total Order Items 0.003224
Total Order Items - B2B 0.010458
Featured Offer (Buy Box) Percentage - B2B \
Sessions - Total 0.085781
Sessions - Total - B2B 0.098951
Session Percentage - Total 0.085863
Session Percentage - Total - B2B 0.098876
Page Views - Total 0.082218
Page Views - Total - B2B 0.094035
Page Views Percentage - Total 0.082279
Page Views Percentage - Total - B2B 0.093824
Featured Offer (Buy Box) Percentage 0.032717
Featured Offer (Buy Box) Percentage - B2B 1.000000
Units Ordered 0.064148
Units Ordered - B2B 0.062024
Unit Session Percentage -0.224982
Unit Session Percentage - B2B 0.075349
Ordered Product Sales 0.063807
Ordered Product Sales - B2B 0.064560
Total Order Items 0.064162
Total Order Items - B2B 0.062703
Units Ordered Units Ordered - B2B \
Sessions - Total 0.844451 0.726260
Sessions - Total - B2B 0.884008 0.761843
Session Percentage - Total 0.844426 0.726482
Session Percentage - Total - B2B 0.883984 0.761985
Page Views - Total 0.843751 0.724666
Page Views - Total - B2B 0.880059 0.767457
Page Views Percentage - Total 0.843881 0.724250
Page Views Percentage - Total - B2B 0.880385 0.767552
Featured Offer (Buy Box) Percentage 0.003260 0.010379
Featured Offer (Buy Box) Percentage - B2B 0.064148 0.062024
Units Ordered 1.000000 0.831508
Units Ordered - B2B 0.831508 1.000000
Unit Session Percentage 0.295612 0.233248
Unit Session Percentage - B2B 0.096341 0.396143
Ordered Product Sales 0.998619 0.829682
Ordered Product Sales - B2B 0.817496 0.995871
Total Order Items 0.999990 0.831778
Total Order Items - B2B 0.831734 0.999620
Unit Session Percentage \
Sessions - Total 0.001252
Sessions - Total - B2B 0.049353
Session Percentage - Total 0.001548
Session Percentage - Total - B2B 0.049823
Page Views - Total 0.003633
Page Views - Total - B2B 0.052661
Page Views Percentage - Total 0.004026
Page Views Percentage - Total - B2B 0.053814
Featured Offer (Buy Box) Percentage -0.003832
Featured Offer (Buy Box) Percentage - B2B -0.224982
Units Ordered 0.295612
Units Ordered - B2B 0.233248
Unit Session Percentage 1.000000
Unit Session Percentage - B2B 0.150584
Ordered Product Sales 0.290694
Ordered Product Sales - B2B 0.222106
Total Order Items 0.295753
Total Order Items - B2B 0.237053
Unit Session Percentage - B2B \
Sessions - Total 0.037124
Sessions - Total - B2B 0.057484
Session Percentage - Total 0.037172
Session Percentage - Total - B2B 0.057602
Page Views - Total 0.035363
Page Views - Total - B2B 0.060744
Page Views Percentage - Total 0.034963
Page Views Percentage - Total - B2B 0.060388
Featured Offer (Buy Box) Percentage -0.007864
Featured Offer (Buy Box) Percentage - B2B 0.075349
Units Ordered 0.096341
Units Ordered - B2B 0.396143
Unit Session Percentage 0.150584
Unit Session Percentage - B2B 1.000000
Ordered Product Sales 0.091777
Ordered Product Sales - B2B 0.425255
Total Order Items 0.096569
Total Order Items - B2B 0.401209
Ordered Product Sales \
Sessions - Total 0.846511
Sessions - Total - B2B 0.884140
Session Percentage - Total 0.846430
Session Percentage - Total - B2B 0.884149
Page Views - Total 0.846120
Page Views - Total - B2B 0.880517
Page Views Percentage - Total 0.846255
Page Views Percentage - Total - B2B 0.880843
Featured Offer (Buy Box) Percentage 0.003231
Featured Offer (Buy Box) Percentage - B2B 0.063807
Units Ordered 0.998619
Units Ordered - B2B 0.829682
Unit Session Percentage 0.290694
Unit Session Percentage - B2B 0.091777
Ordered Product Sales 1.000000
Ordered Product Sales - B2B 0.817007
Total Order Items 0.998596
Total Order Items - B2B 0.829764
Ordered Product Sales - B2B \
Sessions - Total 0.715047
Sessions - Total - B2B 0.750091
Session Percentage - Total 0.715237
Session Percentage - Total - B2B 0.750264
Page Views - Total 0.713661
Page Views - Total - B2B 0.755919
Page Views Percentage - Total 0.713248
Page Views Percentage - Total - B2B 0.755989
Featured Offer (Buy Box) Percentage 0.009369
Featured Offer (Buy Box) Percentage - B2B 0.064560
Units Ordered 0.817496
Units Ordered - B2B 0.995871
Unit Session Percentage 0.222106
Unit Session Percentage - B2B 0.425255
Ordered Product Sales 0.817007
Ordered Product Sales - B2B 1.000000
Total Order Items 0.817763
Total Order Items - B2B 0.995568
Total Order Items \
Sessions - Total 0.844701
Sessions - Total - B2B 0.884252
Session Percentage - Total 0.844677
Session Percentage - Total - B2B 0.884226
Page Views - Total 0.843984
Page Views - Total - B2B 0.880295
Page Views Percentage - Total 0.844113
Page Views Percentage - Total - B2B 0.880622
Featured Offer (Buy Box) Percentage 0.003224
Featured Offer (Buy Box) Percentage - B2B 0.064162
Units Ordered 0.999990
Units Ordered - B2B 0.831778
Unit Session Percentage 0.295753
Unit Session Percentage - B2B 0.096569
Ordered Product Sales 0.998596
Ordered Product Sales - B2B 0.817763
Total Order Items 1.000000
Total Order Items - B2B 0.831995
Total Order Items - B2B
Sessions - Total 0.719745
Sessions - Total - B2B 0.759954
Session Percentage - Total 0.719959
Session Percentage - Total - B2B 0.760112
Page Views - Total 0.718307
Page Views - Total - B2B 0.765633
Page Views Percentage - Total 0.717904
Page Views Percentage - Total - B2B 0.765736
Featured Offer (Buy Box) Percentage 0.010458
Featured Offer (Buy Box) Percentage - B2B 0.062703
Units Ordered 0.831734
Units Ordered - B2B 0.999620
Unit Session Percentage 0.237053
Unit Session Percentage - B2B 0.401209
Ordered Product Sales 0.829764
Ordered Product Sales - B2B 0.995568
Total Order Items 0.831995
Total Order Items - B2B 1.000000
df_numeric.std()
Sessions - Total 3988.527400
Sessions - Total - B2B 43.852681
Session Percentage - Total 0.003038
Session Percentage - Total - B2B 0.002781
Page Views - Total 5141.492476
Page Views - Total - B2B 57.429323
Page Views Percentage - Total 0.003183
Page Views Percentage - Total - B2B 0.002958
Featured Offer (Buy Box) Percentage 0.008037
Featured Offer (Buy Box) Percentage - B2B 0.166665
Units Ordered 110.745062
Units Ordered - B2B 1.446188
Unit Session Percentage 0.012560
Unit Session Percentage - B2B 0.038312
Ordered Product Sales 84543.946631
Ordered Product Sales - B2B 1164.402321
Total Order Items 110.181450
Total Order Items - B2B 1.425276
dtype: float64
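Although standard deviation is commonly used to assess data dispersion, I am not using it here because it is expressed in each metric's own units (for example, about 3,988 for Sessions - Total versus about 0.003 for Session Percentage - Total), which makes the values hard to compare across metrics.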
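To illustrate the unit dependency, here is a minimal sketch with made-up numbers (the `toy` frame and its values are hypothetical and not taken from the dataset). Rescaling a column changes its standard deviation by the same factor, while its correlation with another column stays the same.

```python
import pandas as pd

# Hypothetical toy data, not taken from the assessment dataset
toy = pd.DataFrame({
    "sessions": [100, 250, 400, 800, 1500],
    "units": [2, 6, 9, 20, 35],
})

# Express the same sessions in thousands: only the unit changes
toy["sessions_k"] = toy["sessions"] / 1000

# Standard deviation scales with the unit, correlation does not
std_ratio = toy["sessions"].std() / toy["sessions_k"].std()
corr_raw = toy["sessions"].corr(toy["units"])
corr_scaled = toy["sessions_k"].corr(toy["units"])

print(f"std ratio after rescaling: {std_ratio:.1f}")
print(f"correlation (raw units): {corr_raw:.4f}")
print(f"correlation (rescaled): {corr_scaled:.4f}")
```

This is why the correlation matrix is easier to read across metrics than the raw `std()` output.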
plt.figure(figsize=(10, 8))
sns.heatmap(df_numeric.corr(), annot=True, cmap='coolwarm', fmt=".2f")
plt.title("Correlation Heatmap")
plt.show()
I have chosen a correlation heatmap to examine how the features vary together. The advantage of correlation is that it measures the relationship between two features independently of their units, which makes it an effective tool for identifying key features. Based on the heatmap, I have observed the following points:
1. Highly Correlated Features: Cells with a larger coefficient are drawn in a darker red on the coolwarm colormap, indicating a stronger positive relationship between the two features.
2. Redundant Features: Sessions - Total and Session Percentage - Total are almost perfectly correlated with each other, as are Sessions - Total - B2B and Session Percentage - Total - B2B, so each pair shows nearly identical correlations with every other feature. The same holds for Page Views - Total and Page Views Percentage - Total, along with their B2B counterparts (the correlation is above 0.9999 in each case). Hence, I will keep only the percentage-based features to streamline the analysis.
3. Low Correlation with Buy Box: Featured Offer (Buy Box) Percentage and Featured Offer (Buy Box) Percentage - B2B are only weakly correlated with the order metrics: about 0.00 and 0.06 with Total Order Items, and about 0.01 and 0.06 with Total Order Items - B2B. They are therefore unlikely to be key drivers of order volume.
4. Unit Session Percentage Impact: Unit Session Percentage shows a moderate correlation (about 0.30) with Total Order Items, and Unit Session Percentage - B2B shows a somewhat stronger correlation (about 0.40) with Total Order Items - B2B. Unit Session Percentage - B2B is only weakly correlated (about 0.10) with Total Order Items. These features will be useful in tracking their impact on business performance.
5. Target Variables: I have selected Total Order Items and Total Order Items - B2B as target variables. They are almost perfectly correlated with Ordered Product Sales (0.9986) and Ordered Product Sales - B2B (0.9956), and they closely track Units Ordered (0.99999) and Units Ordered - B2B (0.9996), making them essential metrics for tracking overall and B2B sales.
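Reading redundant pairs off an 18 x 18 heatmap by eye is error-prone, so a small helper can list them directly. The sketch below is an illustration only: `correlated_pairs`, the 0.99 threshold, and the `toy` frame are assumptions introduced here, not part of the original notebook.

```python
import numpy as np
import pandas as pd

def correlated_pairs(frame: pd.DataFrame, threshold: float = 0.99) -> pd.Series:
    """Return column pairs whose absolute Pearson correlation is >= threshold."""
    corr = frame.corr().abs()
    # Keep only the upper triangle (k=1 drops the diagonal) so each pair appears once
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    return pairs[pairs >= threshold].sort_values(ascending=False)

# Hypothetical toy data: "share" is just "count" divided by its total
toy = pd.DataFrame({"count": [10, 40, 25, 5, 20], "other": [3, 1, 4, 1, 5]})
toy["share"] = toy["count"] / toy["count"].sum()

redundant = correlated_pairs(toy)
print(redundant)
```

Applied to `df_numeric`, a call such as `correlated_pairs(df_numeric, 0.99)` would be expected to surface the count/percentage pairs discussed above, which can then be reduced to one column each.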
df_num_up = df_numeric.copy()  # copy, so that dropping columns below does not modify df_numeric
df_num_up
| | Sessions - Total | Sessions - Total - B2B | Session Percentage - Total | Session Percentage - Total - B2B | Page Views - Total | Page Views - Total - B2B | Page Views Percentage - Total | Page Views Percentage - Total - B2B | Featured Offer (Buy Box) Percentage | Featured Offer (Buy Box) Percentage - B2B | Units Ordered | Units Ordered - B2B | Unit Session Percentage | Unit Session Percentage - B2B | Ordered Product Sales | Ordered Product Sales - B2B | Total Order Items | Total Order Items - B2B |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 50911 | 434 | 0.0388 | 0.0275 | 64844 | 567 | 0.0401 | 0.0292 | 0.9963 | 0.9935 | 994 | 16 | 0.0195 | 0.0369 | 768432.13 | 12784.00 | 990 | 15 |
| 1 | 18738 | 263 | 0.0143 | 0.0167 | 24568 | 341 | 0.0152 | 0.0176 | 0.9954 | 0.9945 | 923 | 9 | 0.0493 | 0.0342 | 708748.65 | 7147.24 | 918 | 9 |
| 2 | 25855 | 342 | 0.0197 | 0.0217 | 35132 | 488 | 0.0217 | 0.0252 | 0.9950 | 0.9960 | 826 | 12 | 0.0319 | 0.0351 | 635619.40 | 9508.05 | 822 | 12 |
| 3 | 17877 | 238 | 0.0136 | 0.0151 | 23382 | 330 | 0.0145 | 0.0170 | 0.9949 | 0.9932 | 745 | 9 | 0.0417 | 0.0378 | 571969.78 | 7151.05 | 742 | 9 |
| 4 | 11689 | 170 | 0.0089 | 0.0108 | 14801 | 211 | 0.0092 | 0.0109 | 0.9983 | 1.0000 | 709 | 8 | 0.0607 | 0.0471 | 540178.22 | 6392.00 | 700 | 8 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 494 | 255 | 1 | 0.0002 | 0.0001 | 286 | 2 | 0.0002 | 0.0001 | 1.0000 | 0.0000 | 7 | 0 | 0.0275 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 495 | 154 | 1 | 0.0001 | 0.0001 | 191 | 1 | 0.0001 | 0.0001 | 0.9944 | 1.0000 | 7 | 0 | 0.0455 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 496 | 171 | 1 | 0.0001 | 0.0001 | 197 | 1 | 0.0001 | 0.0001 | 0.9946 | 0.0000 | 7 | 0 | 0.0409 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 497 | 484 | 7 | 0.0004 | 0.0004 | 600 | 7 | 0.0004 | 0.0004 | 0.9964 | 1.0000 | 7 | 0 | 0.0145 | 0.0000 | 5589.20 | 0.00 | 7 | 0 |
| 498 | 270 | 6 | 0.0002 | 0.0004 | 312 | 7 | 0.0002 | 0.0004 | 1.0000 | 1.0000 | 7 | 0 | 0.0259 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
499 rows × 18 columns
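# Drop the raw count columns (redundant with their percentage versions) and the weakly correlated Buy Box columns
cols_to_drop = ["Sessions - Total", "Sessions - Total - B2B", "Page Views - Total", "Page Views - Total - B2B",
                "Featured Offer (Buy Box) Percentage", "Featured Offer (Buy Box) Percentage - B2B"]
df_num_up = df_num_up.drop(columns=cols_to_drop)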
df_num_up
| | Session Percentage - Total | Session Percentage - Total - B2B | Page Views Percentage - Total | Page Views Percentage - Total - B2B | Units Ordered | Units Ordered - B2B | Unit Session Percentage | Unit Session Percentage - B2B | Ordered Product Sales | Ordered Product Sales - B2B | Total Order Items | Total Order Items - B2B |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0388 | 0.0275 | 0.0401 | 0.0292 | 994 | 16 | 0.0195 | 0.0369 | 768432.13 | 12784.00 | 990 | 15 |
| 1 | 0.0143 | 0.0167 | 0.0152 | 0.0176 | 923 | 9 | 0.0493 | 0.0342 | 708748.65 | 7147.24 | 918 | 9 |
| 2 | 0.0197 | 0.0217 | 0.0217 | 0.0252 | 826 | 12 | 0.0319 | 0.0351 | 635619.40 | 9508.05 | 822 | 12 |
| 3 | 0.0136 | 0.0151 | 0.0145 | 0.0170 | 745 | 9 | 0.0417 | 0.0378 | 571969.78 | 7151.05 | 742 | 9 |
| 4 | 0.0089 | 0.0108 | 0.0092 | 0.0109 | 709 | 8 | 0.0607 | 0.0471 | 540178.22 | 6392.00 | 700 | 8 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 494 | 0.0002 | 0.0001 | 0.0002 | 0.0001 | 7 | 0 | 0.0275 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 495 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 7 | 0 | 0.0455 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 496 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 7 | 0 | 0.0409 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 497 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 7 | 0 | 0.0145 | 0.0000 | 5589.20 | 0.00 | 7 | 0 |
| 498 | 0.0002 | 0.0004 | 0.0002 | 0.0004 | 7 | 0 | 0.0259 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
499 rows × 12 columns
plt.rcParams["figure.figsize"] = [5, 5]
# Horizontal box plot of the overall target variable
plt.boxplot(df_num_up["Total Order Items"], vert=False)
plt.xlabel("Total Order Items", fontsize=20)
# display the plot
plt.show()
plt.rcParams["figure.figsize"] = [5, 5]
# Horizontal box plot of the B2B target variable
plt.boxplot(df_num_up["Total Order Items - B2B"], vert=False)
# set the label for x-axis
plt.xlabel("Total Order Items - B2B", fontsize=20)
# display the plot
plt.show()
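# Plot the distribution of every numeric column that remains after the drop
num_cols = df_num_up.select_dtypes(include=[np.number]).columns
for col in num_cols:
    plt.figure(figsize=(6, 4))
    # Read from df_num_up, the same frame the column names were taken from
    sns.histplot(df_num_up[col], kde=True)
    plt.title(f'Distribution of {col}')
    plt.xlabel(col)
    plt.ylabel('Frequency')
    plt.show()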
Upon reviewing the box plots and histograms, it is clear that both target variables, Total Order Items and Total Order Items - B2B, contain outliers. To handle this, I am dividing the features into buckets. Should the split be based on the mean, mode, or median? Since the features are not symmetrically distributed and several are positively skewed, I am splitting them on quantiles instead. Quantile cut points are not pulled by extreme values, so each bucket receives a comparable number of rows.
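The difference between equal-width and quantile bucketing on skewed data can be sketched as follows. The `values` series is hypothetical and chosen only to mimic a positively skewed column with one outlier; `pd.cut` and `pd.qcut` are the standard pandas functions.

```python
import pandas as pd

# Hypothetical right-skewed values: mostly small observations and one large outlier
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 200])
labels = ["Low", "Medium", "High"]

# Equal-width bins: the outlier stretches the range, so almost every row lands in "Low"
equal_width = pd.cut(values, bins=3, labels=labels)

# Quantile bins: the cut points follow the data, so each bucket gets a third of the rows
quantile_based = pd.qcut(values, q=3, labels=labels)

# Count rows per bucket, keeping empty buckets visible as 0
width_counts = equal_width.astype(str).value_counts().reindex(labels, fill_value=0)
quantile_counts = quantile_based.astype(str).value_counts().reindex(labels, fill_value=0)

print("Equal-width buckets:", width_counts.tolist())
print("Quantile buckets:", quantile_counts.tolist())
```

With equal-width bins a single outlier leaves the Medium bucket empty, whereas the quantile split stays balanced, which is the behaviour wanted here.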
df_num_up
# Work on a copy of the DataFrame, because the quantile bucket columns are added to it.
df_num_up1 = df_num_up.copy()
df_num_up1
| | Session Percentage - Total | Session Percentage - Total - B2B | Page Views Percentage - Total | Page Views Percentage - Total - B2B | Units Ordered | Units Ordered - B2B | Unit Session Percentage | Unit Session Percentage - B2B | Ordered Product Sales | Ordered Product Sales - B2B | Total Order Items | Total Order Items - B2B |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0388 | 0.0275 | 0.0401 | 0.0292 | 994 | 16 | 0.0195 | 0.0369 | 768432.13 | 12784.00 | 990 | 15 |
| 1 | 0.0143 | 0.0167 | 0.0152 | 0.0176 | 923 | 9 | 0.0493 | 0.0342 | 708748.65 | 7147.24 | 918 | 9 |
| 2 | 0.0197 | 0.0217 | 0.0217 | 0.0252 | 826 | 12 | 0.0319 | 0.0351 | 635619.40 | 9508.05 | 822 | 12 |
| 3 | 0.0136 | 0.0151 | 0.0145 | 0.0170 | 745 | 9 | 0.0417 | 0.0378 | 571969.78 | 7151.05 | 742 | 9 |
| 4 | 0.0089 | 0.0108 | 0.0092 | 0.0109 | 709 | 8 | 0.0607 | 0.0471 | 540178.22 | 6392.00 | 700 | 8 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 494 | 0.0002 | 0.0001 | 0.0002 | 0.0001 | 7 | 0 | 0.0275 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 495 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 7 | 0 | 0.0455 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 496 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 7 | 0 | 0.0409 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
| 497 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 7 | 0 | 0.0145 | 0.0000 | 5589.20 | 0.00 | 7 | 0 |
| 498 | 0.0002 | 0.0004 | 0.0002 | 0.0004 | 7 | 0 | 0.0259 | 0.0000 | 5593.00 | 0.00 | 7 | 0 |
499 rows × 12 columns
import pandas as pd
# List of percentage columns to bucket
percentage_columns = [
    'Session Percentage - Total',
    'Session Percentage - Total - B2B',
    'Page Views Percentage - Total',
    'Page Views Percentage - Total - B2B',
    'Unit Session Percentage',
    'Unit Session Percentage - B2B'
]
# Function to bucket a column into quantiles (2 or 3 buckets: Low, Medium, High)
def quantile_buckets(df_num_up1, column):
    try:
        quantiles, bins = pd.qcut(df_num_up1[column], q=3, retbins=True, duplicates='drop')
        num_bins = len(bins) - 1  # there is one more bin edge than there are bins
        if num_bins == 3:
            labels = ['Low', 'Medium', 'High']
        elif num_bins == 2:
            labels = ['Low', 'High']  # only two bins, so label them Low and High
        else:
            print(f"Column {column} does not have enough unique values for binning.")
            return ['Low'] * len(df_num_up1[column])  # assign 'Low' if not enough unique values
        # Relabel the bins computed above instead of cutting the column a second time
        return quantiles.cat.rename_categories(labels)
    except ValueError as e:
        print(f"Error for column {column}: {e}")
        return None
# Add one "<column>_Bucket" column per percentage column
for col in percentage_columns:
    df_num_up1[col + '_Bucket'] = quantile_buckets(df_num_up1, col)
print(df_num_up1.head())
Column Unit Session Percentage - B2B does not have enough unique values for binning.
Session Percentage - Total Session Percentage - Total - B2B \
0 0.0388 0.0275
1 0.0143 0.0167
2 0.0197 0.0217
3 0.0136 0.0151
4 0.0089 0.0108
Page Views Percentage - Total Page Views Percentage - Total - B2B \
0 0.0401 0.0292
1 0.0152 0.0176
2 0.0217 0.0252
3 0.0145 0.0170
4 0.0092 0.0109
Units Ordered Units Ordered - B2B Unit Session Percentage \
0 994 16 0.0195
1 923 9 0.0493
2 826 12 0.0319
3 745 9 0.0417
4 709 8 0.0607
Unit Session Percentage - B2B Ordered Product Sales \
0 0.0369 768432.13
1 0.0342 708748.65
2 0.0351 635619.40
3 0.0378 571969.78
4 0.0471 540178.22
Ordered Product Sales - B2B Total Order Items Total Order Items - B2B \
0 12784.00 990 15
1 7147.24 918 9
2 9508.05 822 12
3 7151.05 742 9
4 6392.00 700 8
Session Percentage - Total_Bucket Session Percentage - Total - B2B_Bucket \
0 High High
1 High High
2 High High
3 High High
4 High High
Page Views Percentage - Total_Bucket \
0 High
1 High
2 High
3 High
4 High
Page Views Percentage - Total - B2B_Bucket Unit Session Percentage_Bucket \
0 High Medium
1 High High
2 High High
3 High High
4 High High
Unit Session Percentage - B2B_Bucket
0 Low
1 Low
2 Low
3 Low
4 Low
I have converted the percentage features into High, Medium, and Low buckets to allow a more granular analysis of Units Ordered. Unit Session Percentage - B2B is the exception: it did not have enough unique values for binning, so all of its rows were labelled Low. This segmentation gives a clearer picture of how different percentage ranges relate to order volumes. With the data categorized into these buckets, I can now assess the relationship between each bucket and the number of units ordered, which shows which percentage levels are associated with higher order quantities.
# Create pivot tables for each bucket and Units Ordered columns
pivot_total = pd.pivot_table(df_num_up1, values='Units Ordered', index='Session Percentage - Total_Bucket', aggfunc=[ 'sum'])
pivot_b2b = pd.pivot_table(df_num_up1, values='Units Ordered - B2B', index='Session Percentage - Total - B2B_Bucket', aggfunc=[ 'sum'])
# Print pivot tables
print("Pivot Table for Units Ordered by Session Percentage - Total Bucket:")
print(pivot_total)
print("\nPivot Table for Units Ordered - B2B by Session Percentage - Total - B2B Bucket:")
print(pivot_b2b)
Pivot Table for Units Ordered by Session Percentage - Total Bucket:
sum
Units Ordered
Session Percentage - Total_Bucket
Low 2555
Medium 3176
High 20603
Pivot Table for Units Ordered - B2B by Session Percentage - Total - B2B Bucket:
sum
Units Ordered - B2B
Session Percentage - Total - B2B_Bucket
Low 18
Medium 42
High 214
C:\Users\Harsh Mittal\AppData\Local\Temp\ipykernel_19328\2793675384.py:2: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior pivot_total = pd.pivot_table(df_num_up1, values='Units Ordered', index='Session Percentage - Total_Bucket', aggfunc=[ 'sum']) C:\Users\Harsh Mittal\AppData\Local\Temp\ipykernel_19328\2793675384.py:3: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior pivot_b2b = pd.pivot_table(df_num_up1, values='Units Ordered - B2B', index='Session Percentage - Total - B2B_Bucket', aggfunc=[ 'sum'])
* The analysis of Units Ordered by Session Percentage - Total bucket shows:
1. The High bucket accounts for the most units ordered: 20,603 of 26,334 (about 78%).
2. The Medium bucket shows a much lower volume, with 3,176 units ordered.
3. The Low bucket accounts for 2,555 units ordered, so rows with lower session percentages contribute fewer units.
* Similarly, the analysis of Units Ordered - B2B by Session Percentage - Total - B2B bucket shows:
1. The High bucket leads with 214 of 274 units ordered in the B2B category (about 78%).
2. The Medium bucket follows with 42 units, while the Low bucket accounts for only 18 units.
This suggests that higher session percentages, both overall and for B2B, are strongly associated with higher order volumes. Strategies that raise session percentages could therefore lead to higher unit sales, especially in the high-performing segments.
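The FutureWarning printed by the pivot cell above asks for `observed` to be passed explicitly when the index is categorical. A minimal sketch of that, which also turns the bucket sums into shares of the total, might look like this. The `toy` frame, its values, and the `Share` column are hypothetical stand-ins for `df_num_up1`.

```python
import pandas as pd

# Hypothetical toy frame standing in for df_num_up1
toy = pd.DataFrame({
    "Session Percentage - Total": [0.0001, 0.0002, 0.0004, 0.0010, 0.0150, 0.0400],
    "Units Ordered": [7, 8, 12, 20, 300, 900],
})
toy["Bucket"] = pd.qcut(
    toy["Session Percentage - Total"], q=3, labels=["Low", "Medium", "High"]
)

# Passing observed explicitly avoids the FutureWarning for categorical keys
pivot = pd.pivot_table(
    toy, values="Units Ordered", index="Bucket", aggfunc="sum", observed=False
)

# Share of all units contributed by each bucket
pivot["Share"] = pivot["Units Ordered"] / pivot["Units Ordered"].sum()
print(pivot.round(3))
```

Reporting the share next to the raw sum makes buckets comparable across metrics with very different magnitudes, such as overall and B2B units.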
# Create pivot tables for each bucket and Units Ordered columns
pivot_total = pd.pivot_table(df_num_up1, values='Units Ordered', index='Page Views Percentage - Total_Bucket', aggfunc=[ 'sum'])
pivot_b2b = pd.pivot_table(df_num_up1, values='Units Ordered - B2B', index='Page Views Percentage - Total - B2B_Bucket', aggfunc=[ 'sum'])
# Print pivot tables
print("Pivot Table for Units Ordered by Page Views Percentage - Total Bucket:")
print(pivot_total)
print("\nPivot Table for Units Ordered - B2B by Page Views Percentage - Total - B2B Bucket:")
print(pivot_b2b)
Pivot Table for Units Ordered by Page Views Percentage - Total Bucket:
sum
Units Ordered
Page Views Percentage - Total_Bucket
Low 2597
Medium 3358
High 20379
Pivot Table for Units Ordered - B2B by Page Views Percentage - Total - B2B Bucket:
sum
Units Ordered - B2B
Page Views Percentage - Total - B2B_Bucket
Low 14
Medium 43
High 217
C:\Users\Harsh Mittal\AppData\Local\Temp\ipykernel_19328\1107112162.py:2: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior pivot_total = pd.pivot_table(df_num_up1, values='Units Ordered', index='Page Views Percentage - Total_Bucket', aggfunc=[ 'sum']) C:\Users\Harsh Mittal\AppData\Local\Temp\ipykernel_19328\1107112162.py:3: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior pivot_b2b = pd.pivot_table(df_num_up1, values='Units Ordered - B2B', index='Page Views Percentage - Total - B2B_Bucket', aggfunc=[ 'sum'])
* The analysis of Units Ordered by Page Views Percentage - Total bucket shows:
1. The High bucket contributes 20,379 units ordered (about 77% of the total), indicating that higher page view percentages go together with more orders.
2. The Medium bucket accounts for 3,358 units, while the Low bucket accounts for 2,597 units.
* For Units Ordered - B2B by Page Views Percentage - Total - B2B bucket:
1. The High bucket dominates with 217 of 274 units ordered in the B2B segment (about 79%).
2. The Medium bucket follows with 43 units, while the Low bucket accounts for only 14 units.
This analysis shows that higher page view percentages are positively associated with higher order volumes in both the overall and B2B segments. Focusing efforts on increasing page views could drive greater sales, especially within the high-performing groups.
pivot_total = pd.pivot_table(df_num_up1, values='Ordered Product Sales', index='Session Percentage - Total_Bucket', aggfunc=[ 'sum'])
pivot_b2b = pd.pivot_table(df_num_up1, values='Ordered Product Sales - B2B', index='Session Percentage - Total - B2B_Bucket', aggfunc=[ 'sum'])
# Print pivot tables
print("Pivot Table for Product sales Ordered by Session Percentage - Total Bucket:")
print(pivot_total)
print("\nPivot Table for Product Sales - B2B by Session Percentage - Total - B2B Bucket:")
print(pivot_b2b)
Pivot Table for Product sales Ordered by Session Percentage - Total Bucket:
sum
Ordered Product Sales
Session Percentage - Total_Bucket
Low 2519670.54
Medium 2820676.45
High 16023759.13
Pivot Table for Product Sales - B2B by Session Percentage - Total - B2B Bucket:
sum
Ordered Product Sales - B2B
Session Percentage - Total - B2B_Bucket
Low 17582.00
Medium 38378.10
High 172141.02
C:\Users\Harsh Mittal\AppData\Local\Temp\ipykernel_19328\2580202820.py:2: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior pivot_total = pd.pivot_table(df_num_up1, values='Ordered Product Sales', index='Session Percentage - Total_Bucket', aggfunc=[ 'sum']) C:\Users\Harsh Mittal\AppData\Local\Temp\ipykernel_19328\2580202820.py:3: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior pivot_b2b = pd.pivot_table(df_num_up1, values='Ordered Product Sales - B2B', index='Session Percentage - Total - B2B_Bucket', aggfunc=[ 'sum'])
* The analysis of Ordered Product Sales by Session Percentage - Total bucket shows:
1. The High bucket drives the highest sales, with a total of ₹16,023,759.13 (about 75% of all sales), indicating that higher session percentages go together with significantly higher sales.
2. The Medium bucket contributes ₹2,820,676.45 in sales.
3. The Low bucket contributes ₹2,519,670.54, the lowest of the three.
* For Ordered Product Sales - B2B by Session Percentage - Total - B2B bucket:
1. The High bucket leads with ₹172,141.02 in B2B sales (about 75% of B2B sales).
2. The Medium bucket follows with ₹38,378.10, and the Low bucket contributes ₹17,582.00.
This analysis shows that higher session percentages are associated with substantially higher overall and B2B product sales. Focusing on increasing session percentages, particularly in the high-performing groups, could greatly enhance revenue generation.
pivot_total = pd.pivot_table(df_num_up1, values='Ordered Product Sales', index='Page Views Percentage - Total_Bucket', aggfunc=[ 'sum'])
pivot_b2b = pd.pivot_table(df_num_up1, values='Ordered Product Sales - B2B', index='Page Views Percentage - Total - B2B_Bucket', aggfunc=[ 'sum'])
# Print pivot tables
print("Pivot Table for Product sales Ordered by Page Views Percentage - Total_Bucket:")
print(pivot_total)
print("\nPivot Table for Product Sales - B2B by Page Views Percentage - Total - B2B_Bucket:")
print(pivot_b2b)
Pivot Table for Product sales Ordered by Page Views Percentage - Total_Bucket:
sum
Ordered Product Sales
Page Views Percentage - Total_Bucket
Low 2557387.21
Medium 2956962.89
High 15849756.02
Pivot Table for Product Sales - B2B by Page Views Percentage - Total - B2B_Bucket:
sum
Ordered Product Sales - B2B
Page Views Percentage - Total - B2B_Bucket
Low 13986.00
Medium 37977.10
High 176138.02
C:\Users\Harsh Mittal\AppData\Local\Temp\ipykernel_19328\2826279556.py:1: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior pivot_total = pd.pivot_table(df_num_up1, values='Ordered Product Sales', index='Page Views Percentage - Total_Bucket', aggfunc=[ 'sum']) C:\Users\Harsh Mittal\AppData\Local\Temp\ipykernel_19328\2826279556.py:2: FutureWarning: The default value of observed=False is deprecated and will change to observed=True in a future version of pandas. Specify observed=False to silence this warning and retain the current behavior pivot_b2b = pd.pivot_table(df_num_up1, values='Ordered Product Sales - B2B', index='Page Views Percentage - Total - B2B_Bucket', aggfunc=[ 'sum'])
* The analysis of Ordered Product Sales by Page Views Percentage - Total bucket shows:
1. The High bucket leads significantly, with ₹15,849,756.02 in total sales (about 74% of all sales), showing that higher page view percentages go together with much higher product sales.
2. The Medium bucket contributes ₹2,956,962.89, while the Low bucket brings in ₹2,557,387.21, both far below the High bucket.
* For Ordered Product Sales - B2B by Page Views Percentage - Total - B2B bucket:
1. The High bucket again dominates with ₹176,138.02 in B2B sales (about 77% of B2B sales).
2. The Medium bucket accounts for ₹37,977.10, and the Low bucket generates only ₹13,986.00 in B2B sales.
This analysis underscores the strong association between higher page view percentages and increased product sales, both overall and in the B2B segment. Maximizing page views in the high-performing categories could greatly enhance sales performance.
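The four pivot cells above differ only in the bucket column and the value column, so they could be generated in one loop. The sketch below shows one way to do that. `summarize_by_bucket`, the `Views_Bucket` column, and the `toy` frame are hypothetical names introduced for illustration, and `groupby(...).sum()` is used in place of `pd.pivot_table` as an equivalent way to obtain per-bucket totals.

```python
import pandas as pd

def summarize_by_bucket(frame, pairs):
    """Return {(bucket_col, value_col): per-bucket sums} for each pairing."""
    return {
        (bucket_col, value_col): frame.groupby(bucket_col, observed=False)[value_col].sum()
        for bucket_col, value_col in pairs
    }

# Hypothetical toy frame with one bucket column and two value columns
toy = pd.DataFrame({
    "Views_Bucket": pd.Categorical(
        ["Low", "Low", "High", "High"], categories=["Low", "High"], ordered=True
    ),
    "Units Ordered": [5, 7, 90, 110],
    "Ordered Product Sales": [50.0, 70.0, 900.0, 1100.0],
})

pairs = [
    ("Views_Bucket", "Units Ordered"),
    ("Views_Bucket", "Ordered Product Sales"),
]
tables = summarize_by_bucket(toy, pairs)

# Print every table with a heading built from the actual column names
for (bucket_col, value_col), table in tables.items():
    print(f"{value_col} by {bucket_col}:")
    print(table, end="\n\n")
```

Building the headings from the column names also avoids hand-typed titles drifting out of sync with the table they describe.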